Towards multi-domain speech understanding using a two-stage recognizer
Abstract
This paper describes our efforts in designing a two-stage recognizer with the objective of developing a multi-domain speech understanding system. We envisage one first-stage recognition engine that is domain-independent, and multiple second-stage systems specializing in individual domains. A major novelty in our initial two-stage design is a front-end that incorporates ANGIE-based hierarchical sublexical probability models encapsulated within a finite-state transducer (FST) paradigm. This first stage is a context-dependent syllable-level recognizer which outputs acoustic-phonetic networks to be processed in a second pass. The second stage incorporates higher order linguistic knowledge, from phonological to syntactic and semantic, in a tightly coupled search. This system has yielded up to a 28.5% reduction in understanding error, compared with a single-stage context-dependent recognizer which does not use ANGIE-based probabilities.
Similar resources
An Effective Speech Understanding Method with a Multiple Speech Recognizer based on Output Selection using Edit Distance
In this paper, we propose a simple and effective method for speech understanding. The method incorporates some speech recognizers. We use two recognizers, a large vocabulary continuous speech recognizer and a domain-specific speech recognizer. The integrated recognizer is a robust and flexible method for speech understanding. For the integration process, we use a simple edit distance measure of...
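The abstract above is cut off before it says how the edit distance drives output selection, so the following is only a minimal sketch of the measure itself: a word-level Levenshtein distance between two recognizer outputs. The function name `edit_distance` and the sample sentences are illustrative assumptions, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists.

    Illustrative only: the cited paper's exact measure and how it is
    used for output selection are not given in the truncated abstract.
    """
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a word from ref
                            curr[j - 1] + 1,      # insert a word from hyp
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical outputs of a large-vocabulary and a domain-specific recognizer.
general = "show me flights to boston".split()
domain = "show flights to austin".split()
print(edit_distance(general, domain))
```

A small distance indicates that the two recognizers largely agree on the word sequence; what a system does with that signal is a design decision the sketch does not attempt to reproduce.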
Gender-dependent emotion recognition based on HMMs and SPHMMs
It is well known that emotion recognition performance is not ideal. The work of this research is devoted to improving emotion recognition performance by employing a two-stage recognizer that combines and integrates gender recognizer and emotion recognizer into one system. Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) have been used as classifiers in the two-stage ...
FST-based recognition techniques for multi-lingual and multi-domain spontaneous speech
In this paper we present techniques for building multi-domain and multi-lingual recognizers within a finite-state transducer (FST) framework. The flexibility of the FST approach is also demonstrated on the task of incorporating networks modeling different types of non-speech events into an existing word lattice network. The ability to create robust multi-domain and/or multi-lingual recognizers ...
Hybrid Statistical And Structural Semantic Modeling For Thai Multi-Stage Spoken Language Understanding
This article proposes a hybrid statistical and structural semantic model for multi-stage spoken language understanding (SLU). The first stage of this SLU utilizes a weighted finite-state transducer (WFST)-based parser, which encodes the regular grammar of concepts to be extracted. The proposed method improves the regular grammar model by incorporating a well-known n-gram semantic tagger. This h...
Reducing Search by Partitioning the Word Network
Information is passed between the two systems using the Word Network (or Word Lattice) which is a set of word-score pairs together with the start and end points for each word. The word network is organized as a directed acyclic graph, whose arcs are labeled as word-score pairs, and whose nodes are moments in time. The recognition problem is to find the best scoring grammatical sequence of words...
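The word network described above can be sketched as a small data structure with a best-path search. This is a minimal illustration, assuming additive scores (for example log-probabilities) and ignoring the grammatical constraint mentioned in the abstract; the function name `best_path` and the sample arcs are hypothetical.

```python
from collections import defaultdict


def best_path(arcs, start, end):
    """Return (score, words) for the highest-scoring path from start to end.

    arcs: iterable of (from_time, to_time, word, score) with from_time < to_time.
    Scores are assumed to be additive; no grammar is applied in this sketch.
    Returns None if end cannot be reached from start.
    """
    outgoing = defaultdict(list)
    nodes = {start, end}
    for src, dst, word, score in arcs:
        outgoing[src].append((dst, word, score))
        nodes.update((src, dst))

    best = {start: (0.0, [])}
    # Nodes are moments in time and arcs point forward in time,
    # so ascending time order is a valid topological order of the DAG.
    for node in sorted(nodes):
        if node not in best:
            continue  # not reachable from start
        score_here, words_here = best[node]
        for dst, word, score in outgoing[node]:
            candidate = (score_here + score, words_here + [word])
            if dst not in best or candidate[0] > best[dst][0]:
                best[dst] = candidate
    return best.get(end)


# Hypothetical lattice: each arc is a word-score pair between two time points.
arcs = [
    (0, 2, "show", -1.0),
    (0, 2, "so", -1.4),
    (2, 5, "flights", -2.0),
    (2, 4, "light", -1.5),
    (4, 5, "s", -1.5),
]
print(best_path(arcs, 0, 5))
```

Because every arc points forward in time, a single pass over the time-sorted nodes is enough; no general shortest-path algorithm is needed.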
Publication date: 1999